Molecular Systems Biology — Latest Matching Preprints

1

Global Protein-Turnover Quantification in Escherichia coli Reveals Cytoplasmic Recycling under Nitrogen-Limitation

Gupta, M.; Johnson, A.; Cruz, E.; Costa, E.; Guest, R. L.; Li, S. H.-J.; Hart, E. M.; Nguyen, T.; Stadlmeier, M.; Bratton, B. P.; Silhavy, T. J.; Wingreen, N. S.; Gitai, Z.; Wuhr, M.

2023-05-24 systems biology 10.1101/2022.08.01.502339 medRxiv

Top 0.1%

53.4%

Protein turnover is critical for proteostasis, but turnover quantification is challenging, and even in well-studied E. coli, proteome-wide measurements remain scarce. Here, we quantify the degradation rates of ~3.2k E. coli proteins under 12 conditions by combining heavy isotope labeling with complement reporter ion quantification and find that cytoplasmic proteins are recycled when nitrogen is limited. We use knockout experiments to assign substrates to the known cytoplasmic ATP-dependent proteases. Surprisingly, none of these proteases are responsible for the observed cytoplasmic protein degradation in nitrogen limitation, suggesting that a major proteolysis pathway in E. coli remains to be discovered. Lastly, we show that protein degradation rates are generally independent of cell division rates. Thus, we introduce broadly applicable technology for protein turnover measurements and provide a rich resource for protein half-lives and protease substrates in E. coli, complementary to genomics data, that will allow researchers to decipher the control of proteostasis.

2

Inferring differential subcellular localisation in comparative spatial proteomics using BANDLE

Crook, O.; Davies, C. T. R.; Gatto, L.; Kirk, P. D.; Lilley, K. S.

2021-01-05 systems biology 10.1101/2021.01.04.425239 medRxiv

Top 0.1%

45.4%

The steady-state localisation of proteins provides vital insight into their function. These localisations are context specific, with proteins translocating between different sub-cellular niches upon perturbation of the subcellular environment. Differential localisation, that is, a change in the steady-state subcellular location of a protein, provides a step towards mechanistic insight of subcellular protein dynamics. Aberrant localisation has been implicated in a number of pathologies, thus differential localisation may help characterise disease states and facilitate rational drug discovery by suggesting novel targets. High-accuracy high-throughput mass spectrometry-based methods now exist to map the steady-state localisation and re-localisation of proteins. Here, we propose a principled Bayesian approach, BANDLE, that uses these data to compute the probability that a protein differentially localises upon cellular perturbation, as well as quantifying the uncertainty in these estimates. Furthermore, BANDLE allows information to be shared across spatial proteomics datasets to improve statistical power. Extensive simulation studies demonstrate that BANDLE reduces the number of both type I and type II errors compared to existing approaches. Application of BANDLE to datasets studying EGF stimulation and AP-4 dependent localisation recovers well-studied translocations, using only two-thirds of the provided data. Moreover, we potentially implicate TMEM199 with AP-4 dependent localisation. In an application to cytomegalovirus infection, we obtain novel insights into the rewiring of the host proteome. Integration of high-throughput transcriptomic and proteomic data, along with degradation assays, acetylation experiments and a cytomegalovirus interactome allows us to provide the functional context of these data.

3

Phenotypic Heterogeneity in the DNA Replication Stress Response Revealed by Quantitative Protein Dynamics Measurements

Ho, B.; Loll-Krippleber, R.; Torres, N. P.; Cuny, A. P.; Rudolf, F.; Brown, G. W.

2022-06-08 systems biology 10.1101/2022.06.08.495346 medRxiv

Top 0.1%

44.0%

Cells respond to environmental stressors by activating programs that result in protein abundance and localization changes. The DNA damage and DNA replication stress responses have been heavily studied and provide exemplars of the roles of protein localization and abundance regulation in proper cellular stress response. While vast amounts of data have been collected to describe the dynamics of yeast proteins in response to numerous external stresses, few have assessed and compared both protein localization kinetics and phenotypic heterogeneity in the same context, particularly during DNA replication stress. We developed a robust yet simple quantification scheme to identify and measure protein localization change events (re-localization) and applied it to the 314 yeast proteins whose subcellular distribution changes following DNA replication stress. We captured different kinetics of protein re-localization, identified proteins with localization changes that were not detected in previous analyses, and defined the extent of heterogeneity in stress-induced protein re-localization. Our imaging platforms and analysis pipeline enable efficient measurements of protein localization phenotypes for single cells over time and will guide future work in elucidating the biological parameters that govern cellular heterogeneity.

4

Cell cycle-dependent protein dynamics in budding yeast resolved by deconvolution of bulk proteomics

Zylstra, A. J.; Rovetta, M.; Vedelaar, S.; Bleischwitz, C.; Fülleborn, J. A.; van Oppen, Y. B.; Markus, H. P.; Korbeld, K. T.; Milias-Argeitis, A.; Buczak, K.; Schmidt, A.; Heinemann, M.

2026-02-13 systems biology 10.64898/2026.02.12.705502 medRxiv

Top 0.1%

41.8%

The cell division cycle is characterised by oscillatory dynamics in regulatory mechanisms and biosynthesis, coordinated with genome replication and segregation. To understand these dynamics, quantitative cell cycle-dependent protein concentration data is essential. Unfortunately, accurate resolution of cell cycle-dependent protein dynamics is challenging because single-cell proteomics is currently infeasible and bulk proteomics requires inherently imperfect cell synchronisation. Here, we developed a computational method to deconvolve cell cycle-dependent protein concentration dynamics and applied it to new budding yeast bulk proteome data. Key to this method was a yeast population model, parameterised with experimental cell cycle progression and volume growth data, for quantifying the desynchronisation in sampled populations. We performed deconvolution on 3373 proteins, using cross-validation to determine regularisation parameters, and identified 563 proteins with cell cycle-dependent dynamics. Many of these dynamics were consistent with known yeast biology, and dynamic proteins were enriched for several metabolic processes, extending previous observations and supporting the emerging picture of metabolic activity as varying substantially over cell cycle phases. We consider the generated cell cycle-resolved budding yeast proteome data a key resource.

5

Mapping temperature-sensitive mutations at a genome-scale to engineer growth-switches in E. coli

Schramm, T.; Pahl, V.; Link, H.

2023-06-02 systems biology 10.1101/2023.06.01.543195 medRxiv

Top 0.1%

39.5%

Temperature-sensitive (TS) mutants are a unique tool to perturb and engineer cellular systems. Here, we constructed a CRISPR library with 15,120 Escherichia coli mutants, each with a single amino acid change in one of 346 essential proteins. 1,269 of these mutants showed temperature-sensitive growth in a time-resolved competition assay. We reconstructed 94 TS mutants and measured their metabolism under growth arrest at 42 °C using metabolomics. Metabolome changes were strong and mutant-specific, showing that metabolism of non-growing E. coli is perturbation-dependent. For example, 24 TS mutants of metabolic enzymes overproduced the direct substrate-metabolite due to a bottleneck in their associated pathway. A strain with TS homoserine kinase (ThrBF267D) produced homoserine for 24 hours, and production was tunable by temperature. Finally, we used a TS subunit of DNA polymerase III (DnaXL289Q) to decouple growth from arginine overproduction in engineered E. coli. These results provide a strategy to identify TS mutants en masse and demonstrate their large potential to produce bacterial metabolites with non-growing cells.

6

The structural context of PTMs at a proteome wide scale

Bludau, I.; Willems, S.; Zeng, W.-F.; Strauss, M. T.; Hansen, F. M.; Tanzer, M. C.; Karayel, O.; Schulman, B. A.; Mann, M.

2022-02-24 systems biology 10.1101/2022.02.23.481596 medRxiv

Top 0.1%

39.1%

The recent revolution in computational protein structure prediction provides folding models for entire proteomes, which can now be integrated with large-scale experimental data. Mass spectrometry (MS)-based proteomics has identified and quantified tens of thousands of post-translational modifications (PTMs), most of them of uncertain functional relevance. In this study, we determine the structural context of these PTMs and investigate how this information can be leveraged to pinpoint potential regulatory sites. Our analysis uncovers global patterns of PTM occurrence across folded and intrinsically disordered regions. We found that this information can help to distinguish regulatory PTMs from those marking improperly folded proteins. Interestingly, the human proteome contains thousands of proteins that have large folded domains linked by short, unstructured regions that are strongly enriched in regulatory phosphosites. These include well-known kinase activation loops that induce protein conformational changes upon phosphorylation. This regulatory mechanism appears to be widespread in kinases but also occurs in other protein families such as solute carriers. It is not limited to phosphorylation but includes ubiquitination and acetylation sites as well. Furthermore, we performed three-dimensional proximity analysis, which revealed examples of spatial co-regulation of different PTM types and potential PTM crosstalk. To enable the community to build upon these first analyses, we provide tools for 3D visualization of proteomics data and PTMs as well as Python libraries for data accession and processing.

7

The Proteomic Landscape of Genome-Wide Genetic Perturbations

Messner, C. B.; Demichev, V.; Muenzner, J.; Aulakh, S.; Röhl, A.; Herrera-Domínguez, L.; Egger, A.-S.; Kamrad, S.; Lemke, O.; Calvani, E.; Mülleder, M.; Lilley, K. S.; Kustatscher, G.; Ralser, M.

2022-05-18 systems biology 10.1101/2022.05.17.492318 medRxiv

Top 0.1%

39.1%

Functional genomic strategies help to address the genotype-phenotype problem by annotating gene function and regulatory networks. Here, we demonstrate that combining functional genomics with proteomics uncovers general principles of protein expression, and provides new avenues to annotate protein function. We recorded precise proteomes for all non-essential gene knock-outs in Saccharomyces cerevisiae. We find that protein abundance is driven by a complex interplay of i) general biological properties, including translation rate, turnover, and copy number variations, and ii) their genetic, metabolic and physical interactions, including membership in protein complexes. We further show that combining genetic perturbation with proteomics provides complementary dimensions of functional annotation: proteomic profiling, reverse proteomic profiling, profile similarity and protein covariation analysis. Thus, our study generates a resource in which nine million protein quantities are linked to 79% of the yeast coding genome, and shows that functional proteomics reveals principles that govern protein expression.

Highlights:
- Nine million protein quantities recorded in ~4,600 non-essential gene deletions in S. cerevisiae reveal principles of how the proteome responds to genetic perturbation
- Genome-scale protein expression is determined by both functional relationships between proteins, as well as common biological responses
- Broad protein expression profiles in slow-growing strains can be explained by chromosomal aneuploidies
- Protein half-life and ribosome occupancy are predictable from protein abundance changes across knock-outs
- Functional proteomics annotates missing gene function in four complementary dimensions

8

Noise propagation shapes condition-dependent gene expression noise in Escherichia coli

Urchueguia, A.; Galbusera, L.; Bellement-Theroue, G.; Julou, T.; van Nimwegen, E. J.

2019-10-07 systems biology 10.1101/795369 medRxiv

Top 0.1%

38.8%

Although it is well appreciated that gene expression is inherently noisy and that transcriptional noise is encoded in a promoter's sequence, little is known about the variation in transcriptional noise across growth conditions. Using flow cytometry, we here quantify transcriptional noise in E. coli genome-wide across 8 growth conditions, and find that noise and gene regulation are intimately coupled. Apart from a growth-rate dependent lower bound on noise, we find that individual promoters show highly condition-dependent noise and that condition-dependent expression noise is shaped by noise propagation from regulators to their targets. A simple model of noise propagation identifies TFs that most contribute to both condition-specific and condition-independent noise propagation. The overall correlation structure of sequence and expression properties of E. coli genes uncovers that genes are organized along two principal axes, with the first axis sorting genes by their mean expression and evolutionary rate of their coding regions, and the second axis sorting genes by their expression noise, the number of regulatory inputs in their promoter, and their expression plasticity.

9

Predicted mechanistic impacts of human protein missense variants

Janes, J.; Muller, M.; Selvaraj, S.; Manoel, D.; Stephenson, J.; Goncalves, C.; Lafita, A.; Polacco, B.; Obernier, K.; Alasoo, K.; Lemos, M. C.; Krogan, N.; Martin, M.; Saraiva, L. R.; Burke, D.; Beltrao, P.

2024-05-29 bioinformatics 10.1101/2024.05.29.596373 medRxiv

Top 0.1%

33.5%

Genome sequencing efforts have led to the discovery of tens of millions of protein missense variants found in the human population with the majority of these having no annotated role and some likely contributing to trait variation and disease. Sequence-based artificial intelligence approaches have become highly accurate at predicting variants that are detrimental to the function of proteins but they do not inform on mechanisms of disruption. Here we combined sequence and structure-based methods to perform proteome-wide prediction of deleterious variants with information on their impact on protein stability, protein-protein interactions and small-molecule binding pockets. AlphaFold2 structures were used to predict approximately 100,000 small-molecule binding pockets and stability changes for over 200 million variants. To inform on protein-protein interfaces we used AlphaFold2 to predict structures for nearly 500,000 protein complexes. We illustrate the value of mechanism-aware variant effect predictions to study the relation between protein stability and abundance and the structural properties of interfaces underlying trans protein quantitative trait loci (pQTLs). We characterised the distribution of mechanistic impacts of protein variants found in patients and experimentally studied example disease linked variants in FGFR1.

10

Towards a systematic map of the functional role of protein phosphorylation

Vieitez, C.; Busby, B. P.; Ochoa, D.; Mateus, A.; Galardini, M.; Jawed, A.; Memon, D.; Potel, C. M.; Vonesch, S. C.; Szu Tu, C.; Shahraz, M.; Stein, F.; Steinmetz, L. M.; Savitski, M. M.; Typas, A.; Beltrao, P.

2019-12-11 systems biology 10.1101/872770 medRxiv

Top 0.1%

32.9%

Phosphorylation is a critical post-translational modification involved in the regulation of almost all cellular processes. However, less than 5% of thousands of recently discovered phosphorylation sites have a known function. Here, we devised a chemical genetic approach to study the functional relevance of phosphorylation in S. cerevisiae. We generated 474 phospho-deficient mutants that, along with the gene deletion library, were screened for fitness in 102 conditions. Of these, 42% exhibited growth phenotypes, suggesting these phosphosites are likely functional. We inferred their function based on the similarity of their growth profiles with that of gene deletions, and validated a subset by thermal proteome profiling and lipidomics. While some phosphomutants showed loss-of-function phenotypes, a higher fraction exhibited phenotypes not seen in the corresponding gene deletion suggestive of a gain-of-function effect. For phosphosites conserved in humans, the severity of the yeast phenotypes is indicative of their human functional relevance. This study provides a roadmap for functionally characterizing phosphorylation in a systematic manner.

11

Integration of proteomics with genomics and transcriptomics increases the diagnostic rate of Mendelian disorders

Kopajtich, R.; Smirnov, D.; Stenton, S. L.; Loipfinger, S.; Meng, C.; Scheller, I.; Freisinger, P.; Baski, R.; Berutti, R.; Behr, J.; Bucher, M.; Distelmaier, F.; Gusic, M.; Hempel, M.; Kulterer, L.; Mayr, H.; Meitinger, T.; Mertes, C.; Metodiev, M.; Nasca, A.; Nadel, A.; Ohtake, A.; Okazaki, Y.; Olsen, R.; Piekutowska-Abramczuk, D.; Roetig, A.; Santer, R.; Schindler, D.; Slama, A.; Staufner, C.; Strom, T.; Verloo, P.; von Kleist-Retzow, J.-C.; Wortmann, S.; Yepez, V.; Lamperti, C.; Ghezzi, D.; Murayama, K.; Ludwig, C.; Gagneur, J.; Prokisch, H.

2021-03-12 genetic and genomic medicine 10.1101/2021.03.09.21253187 medRxiv

Top 0.1%

32.8%

For lack of functional evidence, genome-based diagnostic rates cap at approximately 50% across diverse Mendelian diseases. Here, we demonstrate the effectiveness of combining genomics, transcriptomics, and, for the first time, proteomics and phenotypic descriptors, in a systematic diagnostic approach to discover the genetic cause of mitochondrial diseases. On fibroblast cell lines from 145 individuals, tandem mass tag labelled proteomics detected approximately 8,000 proteins per sample and covered over 50% of all Mendelian disease-associated genes. Aberrant protein expression analysis allowed the validation of candidate protein-destabilising variants, in addition to providing independent complementary functional evidence to variants leading to aberrant RNA expression. Overall, our integrative computational workflow led to genetic resolution for 22% of 121 genetically unsolved whole exome or whole genome negative cases and to the discovery of two novel disease genes. With increasing democratization of high-throughput omics assays, our approach and code provide a blueprint for implementing multi-omics based Mendelian disease diagnostics in routine clinical practice.

12

Proteome dynamics of COVID-19 severity learnt by a graph convolutional network of multi-scale topology

Gauthier, S.; Tran-Dinh, A.; Morilla, I.

2022-07-05 systems biology 10.1101/2022.07.04.498661 medRxiv

Top 0.1%

32.7%

Many efforts have recently been made to characterise the molecular mechanisms of COVID-19. These efforts resulted in a full structural identification of ACE2 as the principal receptor of the SARS-CoV-2 spike protein in the cell. However, there are still important open questions related to other proteins involved in the progression of the disease. To this end, we have modelled the plasma proteome of 384 COVID patients. The model calibrates protein measurements at three time points and also makes use of the detailed clinical evaluation outcome of each patient after their hospital stay at day 28. Our analysis is able to discriminate the severity of the disease by means of a metric based on available WHO scores of disease progression. Then, we identify by topological vectorisation those proteins shifting the most in their expression depending on that severity classification. Finally, the extracted topological invariants with respect to protein expression at different times were used as the basis of a graph convolutional network. This model enabled the dynamical learning of the molecular interactions produced between the identified proteins.

13

Protein degradation and growth dependent dilution substantially shape mammalian proteomes

Leduc, A.; Slavov, N.

2025-02-12 systems biology 10.1101/2025.02.10.637566 medRxiv

Top 0.1%

32.5%

Cellular protein concentrations are controlled by rates of synthesis and clearance, the latter including protein degradation and dilution due to growth. Thus, cell growth rate may influence the mechanisms controlling variation in protein concentrations. To quantify this influence, we analyzed the growth-dependent effects of protein degradation within a cell type (between activated and resting human B-cells), across human cell types and mouse tissues. This analysis benefited from deep and accurate quantification of over 12,000 proteins across four primary tissues using plexDIA. The results indicate that growth-dependent dilution can account for 40% of protein concentration changes across conditions. Furthermore, we find that the variation in protein degradation rates is sufficient to account for up to 50% of the variation in concentrations within slowly growing cells as contrasted with 7% in growing cells. Remarkably, degradation rates differ significantly between proteoforms encoded by the same gene and arising from alternative splicing or alternate RNA decoding. These proteoform-specific degradation rates substantially determine the proteoform abundance, especially in the brain. Thus, our model and data unify previous observations with our new results and demonstrate substantially larger than previously appreciated contributions of protein degradation to protein variation at slow growth, both across proteoforms and tissue types.

14

Transcriptional signatures of cell-cell interactions are dependent on cellular context

Innes, B. T.; Bader, G. D.

2021-09-06 systems biology 10.1101/2021.09.06.459134 medRxiv

Top 0.1%

31.3%

Cell-cell interactions are often predicted from single-cell transcriptomics data based on observing receptor and corresponding ligand transcripts in cells. These predictions could theoretically be improved by inspecting the transcriptome of the receptor cell for evidence of gene expression changes in response to the ligand. It is commonly expected that a given receptor, in response to ligand activation, will have a characteristic downstream gene expression signature. However, this assumption has not been well tested. We used ligand perturbation data from both the high-throughput Connectivity Map resource and published transcriptomic assays of cell lines and purified cell populations to determine whether ligand signals have unique and generalizable transcriptional signatures across biological conditions. Most of the receptors we analyzed did not have such characteristic gene expression signatures; instead, these signatures were highly dependent on cell type. Cell context is thus important when considering transcriptomic evidence of ligand signaling, which makes it challenging to build generalizable ligand-receptor interaction signatures to improve cell-cell interaction predictions.

15

Robotic perturbation proteomics and AI agents enable scalable drug mechanism discovery

Jiang, Y.; Movassaghi, C. S.; Munoz-Estrada, J.; Sundararaman, N.; Momenzadeh, A.; Meyer, J. G.

2026-05-07 systems biology 10.64898/2026.05.04.722718 medRxiv

Top 0.1%

30.8%

Large-scale mass spectrometry-based proteomic screening could reveal cellular mechanisms of drug action at systems resolution but remains limited by experimental complexity and the difficulty of extracting insight from high-dimensional datasets. Here, we describe an end-to-end platform that combines semi-automated sample preparation, rapid LC-MS/MS, and AI agent-based data analysis to enable scalable proteomic screening. In a screen of 172 compounds in HepG2 cells, we generated 1,232 proteomes with more than 8,700 quantified proteins in approximately three weeks. Agentic AI reduced data analysis and interpretation time to less than one day while translating proteomic measurements into structured mechanism-oriented summaries and experimentally testable hypotheses. Guided by this framework, we validated: (1) a cholesterol-lowering effect of methylene blue in vitro and (2) an association between loratadine exposure and increased circulating iron in matched electronic health record analyses. This work establishes a scalable platform for generating proteomic drug perturbation data and automatically converting that data into mechanistic insights and candidate translational hypotheses using AI.

16

Disagreement among variant effect predictors guides experimental prioritization of target proteins

Jonsson, N. F.; Marsh, J. A.; Lindorff-Larsen, K.

2026-03-20 bioinformatics 10.64898/2026.03.18.712765 medRxiv

Top 0.1%

30.5%

Interpreting the functional consequences of genetic variation, especially rare missense variants, remains a significant challenge in human genetics. Computational variant effect predictors (VEPs) and multiplexed assays of variant effects (MAVEs) provide complementary approaches, with VEPs offering scalable predictions and MAVEs delivering detailed empirical measurements. However, MAVEs are resource intensive and cannot yet be applied broadly across the proteome, making it important to identify proteins where experimental mapping will be most informative. We hypothesised that MAVEs should be particularly valuable for proteins where computational predictors disagree, as such disagreement may highlight mechanistic blind spots. To test this, we analysed predictions from ten distinct VEPs across more than 13,000 human proteins and quantified inter-predictor concordance. We observed substantial variability across proteins in the degree of agreement across predictors and investigated structural, functional and gene-level features associated with this variation. We find that inter-VEP concordance showed no relationship with agreement to experimental MAVE data. If predictor agreement reflected how intrinsically predictable a protein is, these quantities would be expected to correlate. Their decoupling instead suggests that MAVEs may provide orthogonal information to VEPs, supporting the use of inter-VEP disagreement to prioritise proteins where experimental data will be most informative. We therefore propose using inter-VEP disagreement as a practical strategy to prioritise proteins for experimental characterization. Focusing on proteins with low predictor concordance should maximise the informational value of new MAVEs, and improve variant interpretation in both research and clinical contexts.

17

Systematic protein complex profiling and differential analysis from co-fractionation mass spectrometry data

Fossati, A.; Li, C.; Sykacek, P.; Heusel, M.; Frommelt, F.; Uliana, F.; Hallal, M.; Bludau, I.; Klemens, C. T.; Xue, P.; Purcell, A. W.; Gstaiger, M.; Aebersold, R.

2020-05-07 systems biology 10.1101/2020.05.06.080465 medRxiv

Top 0.1%

27.2%

Protein complexes, macro-molecular assemblies of two or more proteins, play vital roles in numerous cellular activities and collectively determine the cellular state. Despite the availability of a range of methods for analysing protein complexes, systematic analysis of complexes under multiple conditions has remained challenging. Approaches based on biochemical fractionation of intact, native complexes and correlation of protein profiles have shown promise, for instance in the combination of size exclusion chromatography (SEC) with accurate protein quantification by SWATH/DIA-MS. However, most approaches for interpreting co-fractionation datasets to yield complex composition, abundance and rearrangements between samples depend heavily on prior evidence. We introduce PCprophet, a computational framework to identify novel protein complexes from SEC-SWATH-MS data and to characterize their changes across different experimental conditions. We demonstrate accurate prediction of protein complexes (AUC >0.99 and accuracy around 97%) via five-fold cross-validation on SEC-SWATH-MS data, show improved performance over state-of-the-art approaches on multiple annotated co-fractionation datasets, and describe a Bayesian approach to analyse altered protein-protein interactions across conditions. PCprophet is a generic computational tool consisting of modules for data pre-processing, hypothesis generation, machine-learning prediction, post-prediction processing, and differential analysis. It can be applied to any co-fractionation MS dataset, independent of the separation or quantitative LC-MS workflow employed, and supports the detection and quantitative tracking of novel protein complexes and their physiological dynamics.

18

Omics-Based Interaction Framework - a systems model to reveal molecular drivers of synergy

Pantaleon Garcia, J.; Kulkarni, V. V.; Reese, T. C.; Wali, S.; Wase, S. J.; Zhang, J.; Singh, R.; Caetano, M. S.; Moghaddam, S. J.; Johnson, F. M.; Wang, J.; Wang, Y.; Evans, S. E.

2020-04-18 systems biology 10.1101/2020.04.16.041350 medRxiv

Top 0.1%

27.0%

Bioactive molecule library screening strategies may empirically identify effective combination therapies. However, without a systems theory to interrogate synergistic responses, the molecular mechanisms underlying favorable drug-drug interactions remain unclear, precluding rational design of combination therapies. Here, we introduce Omics-Based Interaction Framework (OBIF) to reveal molecular drivers of synergy through integration of statistical and biological interactions in supra-additive biological responses. OBIF performs full factorial analysis of feature expression data from single vs. dual factor exposures to identify molecular clusters that reveal synergy-mediating pathways, functions and regulators. As a practical demonstration, OBIF analyzed a therapeutic dyad of immunostimulatory small molecules that induces synergistic protection against influenza A pneumonia. OBIF analysis of transcriptomic and proteomic data identified biologically relevant, unanticipated cooperation between RelA and cJun that we subsequently confirmed to be required for the synergistic antiviral protection. To demonstrate generalizability, OBIF was applied to data from a diverse array of Omics platforms and experimental conditions, successfully identifying the molecular clusters driving their synergistic responses. Hence, OBIF is a phenotype-driven systems model that supports multiplatform exploration of synergy mechanisms.

19

The balance of acidic and hydrophobic residues predicts acidic transcriptional activation domains from protein sequence

Kotha, S. R.; Staller, M. V.

2023-02-11 systems biology 10.1101/2023.02.10.528081 medRxiv

Top 0.1%

26.6%

Transcription factors activate gene expression in development, homeostasis, and stress with DNA binding domains and activation domains. Although there exist excellent computational models for predicting DNA binding domains from protein sequence (Stormo, 2013), models for predicting activation domains from protein sequence have lagged behind (Erijman et al., 2020; Ravarani et al., 2018; Sanborn et al., 2021), particularly in metazoans. We recently developed a simple and accurate predictor of acidic activation domains on human transcription factors (Staller et al., 2022). Here, we show how the accuracy of this human predictor arises from the balance between hydrophobic and acidic residues, which together are necessary for acidic activation domain function. When we combine our predictor with the predictions of neural network models trained in yeast, the intersection is more predictive than individual models, emphasizing that each approach carries orthogonal information. We synthesize these findings into a new set of activation domain predictions on human transcription factors.

20

Organellomics: AI-driven deep organellar phenotyping reveals novel ALS mechanisms in human neurons

Krispin, S.; van Zuiden, W.; Danino, Y. M.; Molitor, L.; Rudberg, N.; Bar, C.; Coyne, A.; Meimoun, T.; Waldron, F. M.; Gregory, J. M.; Fisher, T.; Nachshon, A.; Stern-Ginossar, N.; Yacovzada, N. S.; Hornstein, E.

2025-01-29 systems biology 10.1101/2024.01.31.572110 medRxiv

Top 0.1%

26.2%

Systematic assessment of organelle architectures, termed the organellome, offers valuable insights into cellular states and pathomechanisms, but remains largely uncharted. Here, we present a deep phenotypic learning approach based on vision transformers, resulting in the Neuronal Organellomics Vision Atlas (NOVA) model that studies confocal images of more than 30 markers of distinct membrane-bound and membraneless organelles in 11.5 million images of human neurons. Organellomics analysis quantifies perturbation-induced changes in organelle localization and morphology using a rigorous mixed-effects meta-analytic framework that accounts for sampling variance and experimental heterogeneity. Applying this approach, we delineate phenotypic alterations in neurons carrying ALS-associated mutations and uncover a physical and functional crosstalk between cytoplasmic mislocalized TDP-43, a hallmark of ALS, and processing bodies (P-bodies), membraneless organelles regulating mRNA stability. These findings are validated in patient-derived neurons and human neuropathology. NOVA establishes a scalable framework for quantitative mapping of subcellular phenotypes and provides a new avenue for investigating the neurocellular basis of disease.